Data Conversions
========================

Modeling your data and getting it into the right shape may require various conversions, depending on your starting point and/or your goal.
``bnlearn`` provides several functions to convert *from* or *to* the **adjacency matrix** or **vectors**.

Available functionalities:

* adjmat2dict : Convert adjacency matrix to dictionary.
* adjmat2vec : Convert adjacency matrix into vector with source and target.
* vec2adjmat : Convert source-target edges with their weights into an adjacency matrix.
* dag2adjmat : Convert model into adjacency matrix.
* vec2df : Convert source-target edges into sparse dataframe.


Adjacency matrix
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The adjacency matrix stores the relationships (edges) between source and target variables (nodes).
In graph theory, a square matrix is used to represent a finite graph: the elements of the matrix indicate whether pairs of vertices are adjacent or not.
Several ``bnlearn`` functions output an adjacency matrix. Values of 0 or False indicate that two nodes are not connected, whereas pairs of vertices with a value >0 or True are connected.

**Importing a DAG**

Extracting the adjacency matrix from an imported DAG:

.. code-block:: python

   # Import library
   import bnlearn as bn
   # Import DAG
   model = bn.import_DAG('sachs')
   # Show the retrieved adjacency matrix for Sachs:
   model['adjmat']
   # print
   print(model['adjmat'])

Reading the table from left to right (rows are sources, columns are targets), we see that the node Erk is connected to Akt in a directed manner.
This indicates that Erk influences Akt but not the other way around, because the Akt row does not show an edge to Erk.
The table is abbreviated, so there may be additional connections in the column marked "...".

.. table::

   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |      |  Erk|   Akt|   PKA|   Mek|   Jnk| ... |  Raf|   P38|  PIP3|  PIP2|  Plcg|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |Erk   |False| True | False| False| False| ... |False| False| False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |Akt   |False| False| False| False| False| ... |False| False| False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |PKA   |True | True | False| True | True | ... |True | True | False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |Mek   |True | False| False| False| False| ... |False| False| False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |Jnk   |False| False| False| False| False| ... |False| False| False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |PKC   |False| False| True | True | True | ... |True | True | False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |Raf   |False| False| False| True | False| ... |False| False| False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |P38   |False| False| False| False| False| ... |False| False| False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |PIP3  |False| False| False| False| False| ... |False| False| False| True | False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |PIP2  |False| False| False| False| False| ... |False| False| False| False| False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+
   |Plcg  |False| False| False| False| False| ... |False| False| True | True | False|
   +------+-----+------+------+------+------+-----+-----+------+------+------+------+


Vector
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The **vector** stores relationships as source-target pairs of variables (nodes) together with their weights.
An example is shown below, in which an edge is defined when its weight is True or a number >=1.

.. table::

   +------------+------------+---------+
   | source     | target     | weight  |
   +------------+------------+---------+
   | Cloudy     | Sprinkler  | True    |
   +------------+------------+---------+
   | Cloudy     | Rain       | True    |
   +------------+------------+---------+
   | Sprinkler  | Wet_Grass  | True    |
   +------------+------------+---------+
   | Rain       | Wet_Grass  | True    |
   +------------+------------+---------+


adjmat2vec
^^^^^^^^^^^^

Convert an adjacency matrix into a vector with :func:`bnlearn.bnlearn.adjmat2vec`

.. code-block:: python

   import bnlearn as bn
   # Load DAG
   DAG = bn.import_DAG('Sprinkler')
   # Convert adjmat to vector:
   vector = bn.adjmat2vec(DAG['adjmat'])

.. table::

   +------------+------------+---------+
   | source     | target     | weight  |
   +------------+------------+---------+
   | Cloudy     | Sprinkler  | True    |
   +------------+------------+---------+
   | Cloudy     | Rain       | True    |
   +------------+------------+---------+
   | Sprinkler  | Wet_Grass  | True    |
   +------------+------------+---------+
   | Rain       | Wet_Grass  | True    |
   +------------+------------+---------+


vec2adjmat
^^^^^^^^^^^^

Convert the vector created in the example above back into an adjacency matrix with :func:`bnlearn.bnlearn.vec2adjmat`

.. code-block:: python

   import bnlearn as bn
   # Convert vector back to adjmat.
   adjmat = bn.vec2adjmat(vector['source'], vector['target'], weights=vector['weight'])

.. table::

   +-----------+--------+-------------+-------------+----------+
   | source    | Rain   | Sprinkler   | Wet_Grass   | Cloudy   |
   +===========+========+=============+=============+==========+
   | Rain      |      0 |           0 |           1 |        0 |
   +-----------+--------+-------------+-------------+----------+
   | Sprinkler |      0 |           0 |           1 |        0 |
   +-----------+--------+-------------+-------------+----------+
   | Wet_Grass |      0 |           0 |           0 |        0 |
   +-----------+--------+-------------+-------------+----------+
   | Cloudy    |      1 |           1 |           0 |        0 |
   +-----------+--------+-------------+-------------+----------+


adjmat2dict
^^^^^^^^^^^^

Convert an adjacency matrix to a dictionary with :func:`bnlearn.bnlearn.adjmat2dict`

.. code-block:: python

   # Import library
   import bnlearn as bn
   # Load DAG
   DAG = bn.import_DAG('Sprinkler')
   # Convert adjmat to dictionary:
   adjmat_dict = bn.adjmat2dict(DAG['adjmat'])
   # print
   print(adjmat_dict)
   # {'Cloudy': ['Sprinkler', 'Rain'],
   #  'Sprinkler': ['Wet_Grass'],
   #  'Rain': ['Wet_Grass'],
   #  'Wet_Grass': []}


dag2adjmat
^^^^^^^^^^^^

Convert a model into an adjacency matrix with :func:`bnlearn.bnlearn.dag2adjmat`

.. code-block:: python

   # Import library
   import bnlearn as bn
   # Load DAG
   DAG = bn.import_DAG('Sprinkler')
   # Extract edges from model and store in adjacency matrix
   adjmat = bn.dag2adjmat(DAG['model'])

.. table::

   +-----------+--------+-------------+-------------+----------+
   | source    | Rain   | Sprinkler   | Wet_Grass   | Cloudy   |
   +===========+========+=============+=============+==========+
   | Rain      |      0 |           0 |           1 |        0 |
   +-----------+--------+-------------+-------------+----------+
   | Sprinkler |      0 |           0 |           1 |        0 |
   +-----------+--------+-------------+-------------+----------+
   | Wet_Grass |      0 |           0 |           0 |        0 |
   +-----------+--------+-------------+-------------+----------+
   | Cloudy    |      1 |           1 |           0 |        0 |
   +-----------+--------+-------------+-------------+----------+


vec2df
^^^^^^^^^^^^

Convert edges between source and target into a dataframe based on the weight with :func:`bnlearn.bnlearn.vec2df`

For demonstration purposes, a small example is created below. It shows that the weights determine the number of rows: an edge with a weight of 2 results in 2 rows for that edge.

.. code-block:: python

   # Import library
   import bnlearn as bn
   # Create source-target edges with their weights
   source = ['Cloudy', 'Cloudy', 'Sprinkler', 'Rain']
   target = ['Sprinkler', 'Rain', 'Wet_Grass', 'Wet_Grass']
   weights = [1, 2, 1, 3]
   # Convert into sparse dataframe.
   df = bn.vec2df(source, target, weights=weights)

.. table::

   +----+----------+--------+-------------+-------------+
   |    | Cloudy   | Rain   | Sprinkler   | Wet_Grass   |
   +====+==========+========+=============+=============+
   |  0 |        1 |      0 |           1 |           0 |
   +----+----------+--------+-------------+-------------+
   |  1 |        1 |      1 |           0 |           0 |
   +----+----------+--------+-------------+-------------+
   |  2 |        1 |      1 |           0 |           0 |
   +----+----------+--------+-------------+-------------+
   |  3 |        0 |      0 |           1 |           1 |
   +----+----------+--------+-------------+-------------+
   |  4 |        0 |      1 |           0 |           1 |
   +----+----------+--------+-------------+-------------+
   |  5 |        0 |      1 |           0 |           1 |
   +----+----------+--------+-------------+-------------+
   |  6 |        0 |      1 |           0 |           1 |
   +----+----------+--------+-------------+-------------+

To demonstrate the full functionality, a larger example can be loaded that contains 352 edges from the book *A Storm of Swords*.
Converting it yields a dataframe with 4324 rows and 107 columns: each of the 107 unique names becomes a column, and each edge is repeated according to its weight.
Such a dataframe can, for example, serve as input for structure learning approaches.

.. code-block:: python

   # Import library
   import bnlearn as bn
   # Load large example with source-target edges from the book A Storm of Swords
   vec = bn.import_example("stormofswords")
   # Convert into sparse dataframe.
   df = bn.vec2df(vec['source'], vec['target'], weights=vec['weight'])
   # sparse matrix:
   print(df.shape)
   # (4324, 107)


.. include:: add_bottom.add
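
To make the relationship between the adjacency matrix, the source-target vector, and the weight-expanded rows more concrete, the following is a minimal plain-Python sketch. It does not call ``bnlearn`` and is not how the library implements these conversions. The helper names ``adjmat_to_vec``, ``vec_to_adjmat``, and ``vec_to_rows`` are hypothetical and exist only for this illustration, and nested dictionaries and lists stand in for dataframes.

```python
# Illustrative only: plain-Python stand-ins, NOT bnlearn's implementation.
# All three helper names below are hypothetical.

def adjmat_to_vec(adjmat):
    """Return (source, target, weight) tuples for every non-zero entry."""
    return [
        (source, target, weight)
        for source, row in adjmat.items()
        for target, weight in row.items()
        if weight  # 0 and False both mean "no edge"
    ]


def vec_to_adjmat(edges):
    """Rebuild a square nested-dict adjacency matrix from edge tuples."""
    nodes = sorted({node for s, t, _ in edges for node in (s, t)})
    adjmat = {s: {t: 0 for t in nodes} for s in nodes}
    for source, target, weight in edges:
        adjmat[source][target] = weight
    return adjmat


def vec_to_rows(edges):
    """Repeat each edge `weight` times as a one-hot row over all nodes."""
    nodes = sorted({node for s, t, _ in edges for node in (s, t)})
    rows = []
    for source, target, weight in edges:
        row = {node: int(node in (source, target)) for node in nodes}
        rows.extend(dict(row) for _ in range(int(weight)))
    return rows


# Same edges and weights as the small vec2df example.
edges = [
    ("Cloudy", "Sprinkler", 1),
    ("Cloudy", "Rain", 2),
    ("Sprinkler", "Wet_Grass", 1),
    ("Rain", "Wet_Grass", 3),
]

adjmat = vec_to_adjmat(edges)
round_trip = adjmat_to_vec(adjmat)
rows = vec_to_rows(edges)

print(adjmat["Cloudy"])  # {'Cloudy': 0, 'Rain': 2, 'Sprinkler': 1, 'Wet_Grass': 0}
print(sorted(round_trip) == sorted(edges))  # True
print(len(rows))  # 7
```

With the weights 1, 2, 1, and 3 from the small vec2df example, the sketch produces seven rows, in line with the seven-row table shown for that example.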